ABSTRACT



Often a written test is used as an inexpensive substitute
for a performance measure. A specified minimum performance level
or probability of successful performance can be translated into a
minimum passing score for the written test most efficiently by
measuring the performance of students whose written test scores
are near the desired cutoff score. Stochastic approximation
methods accomplish this purpose. The up-and-down method and
the Robbins-Monro process are presented, discussed, and compared.

Written tests in education (and in other fields as well) are frequently
used to make decisions that require the persons tested to be divided into two
groups on the basis of their level of competence. In many cases the written
test serves as an inexpensive substitute for an expensive individual
assessment or performance test. For example, a school might want to determine
which students need instruction in basic reading skills. The school cannot
afford to have a group of experts assess the skills of each pupil
individually, but the school can afford to have all the pupils take a written
test. Those who score below a certain level on the written test will be given
the basic instruction. But how should the school determine that score level?

A similar problem often arises in the case of professional certification
and licensing examinations. Cost considerations rule out the possibility of
having each applicant take a full-scale performance test covering an adequate
sample of the tasks involved in the practice of the profession. Therefore
written tests are commonly used. In this case, the setting of standards
for acceptable performance would seem to be a simple exercise of
professional judgment by the licensing agency. However, the written test is
only an indirect measure of the skills to be tested. How can the agency's
experts translate their judgment of a minimum acceptable level of actual
performance into a minimum passing score for the written test?

These problems correspond closely to some common problems in biological
and industrial testing, and the techniques that have been developed for those
fields can be applied to education as well. For example, biologists frequently
want to know how large a dose of a drug is required to produce an observable
effect on an animal. Individual animals vary in their response to the drug,
and either the drug or the animals may be too expensive for large-sample
tests. Engineers often need to know what level of an input variable in an
industrial process (possibly an amount of an expensive chemical) will
produce a finished product of a specified flexibility, impact resistance, etc.
Samples of the product will vary even when the input is constant, and
measurements of the finished product can be quite expensive.

In general, the problem is to determine what level of input (written
test score) is necessary to produce a given response (performance), when
measurements of the response are difficult or expensive. While the
educator, unlike the biologist or the engineer, cannot control his input
directly, he can control it indirectly by first administering the written
test to a large and diverse sample of persons and then using these written
test scores as a basis for choosing those few persons whose performance will
be individually assessed.

The class of techniques used to solve problems of this type is called
stochastic approximation, and the basic method, as applied to educational
testing, is as follows.

1. Select any person. Record his written test score and measure his
   actual performance.

2. If the first person succeeds on the performance measure (if his
   performance is at or above the minimum acceptable level), choose next
   a person with a somewhat lower written test score. If the first person
   fails on the performance measure, choose next a person with a somewhat
   higher written test score.

3. Repeat step 2, choosing the third person on the basis of the second
   person's measured performance. Continue by choosing each person on
   the basis of the previous person's measured performance.
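
The three steps above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `up_and_down`, the fixed step size, and the `succeeds` callback (a stand-in for actually measuring a person's performance) are hypothetical and do not come from the paper. In practice each "score" would have to be matched to an available person with that written test score.

```python
def up_and_down(start, step, n_measured, succeeds):
    """Select persons by the basic rule: lower score after a success,
    higher score after a failure.

    start      -- written test score of the first person (step 1)
    step       -- assumed fixed amount by which the score is moved
    n_measured -- number of persons whose performance is measured
    succeeds   -- hypothetical callback: score -> True (success) / False
    """
    scores, outcomes = [], []
    score = start
    for _ in range(n_measured):
        ok = succeeds(score)          # the expensive performance measurement
        scores.append(score)
        outcomes.append(ok)
        score += -step if ok else step
    # `score` is now the written test score of the person who would be
    # selected next if the procedure were continued.
    return scores, outcomes, score


# Toy example: a deterministic "performance measure" that is passed by
# anyone scoring 55 or higher (real performance would be noisy).
scores, outcomes, next_score = up_and_down(70, 10, 6, lambda s: s >= 55)
print(scores)       # [70, 60, 50, 60, 50, 60]
print(next_score)   # 50
```

Even in this toy run the selected scores settle quickly into the neighborhood of the cutoff, which is the point of the method.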

The advantage of this method of choosing persons for performance
measurement is that it does not spread these expensive measurements over the
full range of ability, but concentrates them in that portion of the range
where they are needed to determine a cutoff score. For the same reason,
stochastic approximation methods are not appropriate for determining the
validity of the written test. Validation requires a sample that is
representative of the population of interest, while the purpose of stochastic
approximation is to produce a sample that is unrepresentative, in a way that
is particularly useful for determining a cutoff score.

Stochastic approximation techniques can be classified into two types,
according to the way in which the input is varied. In one type the input
is varied by fixed steps. After each observation, we move up one step
or down one step for our next observation. If the observation is a success
(the person succeeds on the performance measure) we move down (try a
person with a written test score one step lower). If the observation is
a failure, we move up. This technique is called the "up-and-down method"
(Dixon and Mood, 1948). There are several variations of the up-and-down
method which are intended to make it either more flexible or more efficient;
some of these will be discussed later in this paper.

In the other type of stochastic approximation technique, the input is
varied by an amount that depends on the difference between the observed
performance and the minimum acceptable performance. For example, if the
first person succeeds on the performance measure by a wide margin, we will
move down fairly far on the written test scale to choose the second person.
But if the first person barely succeeds on the performance measure, we will
choose for the second observation a person with a written test score only
slightly lower than the first person's. The best known and most thoroughly
investigated of these techniques is the Robbins-Monro process (Robbins and
Monro, 1951). It would seem best suited to situations in which the written
test has a large number of items, since it is based on the assumption that
the input variable is continuous.

The test user who has decided to use a stochastic approximation
technique for choosing a minimum passing score finds himself confronted with
some specific problems and decisions:

1. Which stochastic approximation method should he use?

2. How large should the steps be?

3. How many persons should he select for the performance measure?

4. Given the data, how should he choose the minimum passing score?

5. What is the sampling variability of the minimum passing score chosen
   in this manner? How good is it as an estimate of the "true" minimum
   passing score: the score he would choose if he could obtain written
   test and performance scores for all persons in the population?

These questions are all interrelated. They have been answered in several
different ways and are still being investigated by mathematical statisticians.
The remainder of this paper is an attempt to present some of the answers in
a form that will be accessible and useful to educators with some knowledge
of basic statistical concepts. Derivations and proofs will be omitted;
references will be provided for the reader who wishes to investigate the
subject more deeply.¹

Because stochastic approximation techniques were developed for
situations other than educational and occupational testing, the more general
terms "input variable" and "response variable" will sometimes be used in
place of the terms "written test" and "performance measure", respectively.
In addition, the term "response curve" will be used to refer to the
function that gives the expected performance score for any given written test
score.

The up-and-down method 

The up-and-down method was devised for use with a dichotomous response
variable (performance measure). To use it with a continuous response variable
we must impose an artificial dichotomy. To do so, we specify a particular
performance level as the minimum acceptable. We then classify any
performance at or above that level as a success and any performance below that
level as a failure.

The up-and-down method also requires that the input variable (written
test score) scale be divided into discrete levels, or "steps." The basic
up-and-down rule directs us to move up one step on the input scale after a
failure, and down one step after a success. This will cause the written
test scores of the persons we select to center around the score that
corresponds to a fifty per cent probability of success on the performance
measure. (If we are interested in some other probability of success, we
must use a variation of the method described later in this paper.)

¹ A good starting point for such an investigation is the excellent review
by Scheber (1973).



Table 1 presents the notation we will use in describing statistical
procedures for the up-and-down method. Notice that if the performance
measure is continuous, the decisionmaker must specify both the minimum
acceptable level of performance and the minimum acceptable probability
of achieving this level. For example, he might want to estimate the
written test score that corresponds to an eighty per cent probability
of achieving a performance score of 125 or better. In the notation of
Table 1, he would then specify y_0 = 125 and p = .80. Also notice that
when we specify a minimum acceptable probability of success, we are
referring to the probability of success for the lowest-scoring person
we will accept: one whose written test score is exactly equal to
the minimum passing score.



Estimating the true minimum passing score 

At least five distinct procedures have been recommended for estimating
the true minimum passing score. The estimates they yield tend to be close
to each other, as might be expected, but no two of the procedures yield
exactly the same estimate in all cases. Four of these procedures will be
presented here for the basic up-and-down method with p = .50; their adaptation
to variations of the method with p ≠ .50 will be discussed later, in
connection with those variations.

Table 1. Notation

X_i     written test score of the ith person selected for performance
        measurement.
Y_i     observed performance of the ith person.
        random variable resulting from variation in performance between
        persons with written test score X_i, from instability of
        performance, and from unreliability of performance measurement.
        random variable resulting from the fact that the selection of
        person i depends on the observed performance of person i-1.
y_0     minimum acceptable performance level (performance level required
        for success).
p       minimum acceptable probability of success.
x_0     true minimum passing score; the written test score such that, in
        the entire population of interest,
        Prob(Y_i ≥ y_0 | X_i = x_0) = p.
x̂_0     minimum passing score estimated from observed sample data.
        random variable resulting from variability in the data used to
        estimate x_0.

The procedure for estimating x_0 originally suggested by Dixon and Mood
(1948) can be expressed as follows. If there have been more successes than
failures, take the mean written test score for all persons who failed the
performance measure and add half the step size. If there have been
more failures than successes, take the mean written test score for those
persons who succeeded on the performance measure and subtract half the step
size. (A failure indicates that the passing score lies above that person's
written test score, so the correction is upward; this is the sign used in
the Dixon-Mood computation of Table 2.)
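
As a check on the arithmetic, the Dixon-Mood procedure can be written as a short function and applied to the hypothetical data of Table 2. This is a minimal sketch: the function name is an assumption, the sign convention follows the Table 2 computation (add half a step to the mean of the failures, subtract half a step from the mean of the successes), and the tie case, which the text does not cover, is simply rejected.

```python
def dixon_mood_estimate(scores, outcomes, step):
    """Dixon-Mood estimate from measured persons only.

    scores   -- written test scores, in order of selection
    outcomes -- True for success, False for failure
    step     -- step size d
    """
    successes = [x for x, ok in zip(scores, outcomes) if ok]
    failures = [x for x, ok in zip(scores, outcomes) if not ok]
    if not successes or not failures:
        raise ValueError("need at least one success and one failure")
    if len(successes) == len(failures):
        # Not covered by the description in the text.
        raise ValueError("equal numbers of successes and failures")
    if len(successes) > len(failures):
        # Failures are the less frequent outcome: mean plus half a step.
        return sum(failures) / len(failures) + step / 2
    # Successes are the less frequent outcome: mean minus half a step.
    return sum(successes) / len(successes) - step / 2


# Table 2 data (persons 1-8; person 9 was not measured).
scores = [70, 60, 50, 40, 50, 60, 50, 60]
outcomes = [True, True, True, False, False, True, False, True]
print(round(dixon_mood_estimate(scores, outcomes, 10), 2))  # 51.67
```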

A second estimation procedure was suggested by Brownlee, Hodges, and
Rosenblatt (1953). The procedure they recommended is to disregard the first
run of successes or failures, except for the last observation in that run,
and average the written test scores of all the rest of the persons selected
(including that of the person whose performance would be measured next if
the procedure were continued). If the first k persons all succeed (or all
fail) and an additional n persons are selected, then the estimate of x_0 is

\[ \hat{x}_0 = \frac{1}{n+1} \sum_{i=k}^{k+n} X_i . \]

Notice that only k + n - 1 persons will actually have had their performance
measured. However, the (k + n)th person is considered to have been selected,
because his written test score will have been determined by the (k + n - 1)st
person's performance. (In Table 2, k = 3 and n = 6, so the scores of persons
3 through 9 are averaged.)
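
The sample-average procedure is easy to get wrong by one observation at either end, so a small sketch may help. It assumes the reading used in Table 2: the average runs from the last person of the initial run through the person who would be selected next. The function name and argument layout are hypothetical.

```python
def brownlee_estimate(scores, outcomes, step):
    """Sample-average estimate of Brownlee, Hodges, and Rosenblatt.

    scores, outcomes -- measured persons only, in order of selection
    step             -- step size d, used to find the next person's score
    """
    # k = length of the initial run of identical outcomes.
    k = 1
    while k < len(outcomes) and outcomes[k] == outcomes[0]:
        k += 1
    # Score of the person who would be measured next: one step down after
    # a success, one step up after a failure.
    next_score = scores[-1] + (-step if outcomes[-1] else step)
    # Keep the last person of the initial run, everyone after, and the
    # not-yet-measured next person.
    kept = scores[k - 1:] + [next_score]
    return sum(kept) / len(kept)


# Table 2 data (persons 1-8 measured).
scores = [70, 60, 50, 40, 50, 60, 50, 60]
outcomes = [True, True, True, False, False, True, False, True]
print(round(brownlee_estimate(scores, outcomes, 10), 2))  # 51.43
```

If the ninth person (score 50) were measured and succeeded, the same function gives 50.0, in agreement with the discussion of Table 2.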

A third estimation procedure is Wetherill's "peaks-and-valleys" method,
suggested by Wetherill and Levitt (1965) and Wetherill (1975). A "peak"
is any success preceded by a failure; a "valley" is any failure preceded by
a success. The descriptive terms derive from the fact that a "peak" represents
a person with a written test score higher than those of the persons selected
just before and just after him; a "valley" is exactly the opposite. (Because
we move up after a failure and down after a success, a success that follows
a failure is a local high point of the sequence of scores.) The
estimate of x_0 is simply the mean written test score for all the "peaks"
and "valleys".
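
In computational terms, the peaks and valleys together are simply the persons whose outcome differs from that of the person selected just before them. A minimal sketch under that reading (the function name is an assumption):

```python
def peaks_and_valleys_estimate(scores, outcomes):
    """Mean written test score over all peaks and valleys.

    A person is a peak or a valley when his outcome differs from the
    outcome of the person measured immediately before him.
    """
    turning_points = [
        scores[i]
        for i in range(1, len(scores))
        if outcomes[i] != outcomes[i - 1]
    ]
    if not turning_points:
        raise ValueError("no change of outcome observed yet")
    return sum(turning_points) / len(turning_points)


# Table 2 data (persons 1-8 measured): turning points are 40, 60, 50, 60.
scores = [70, 60, 50, 40, 50, 60, 50, 60]
outcomes = [True, True, True, False, False, True, False, True]
print(peaks_and_valleys_estimate(scores, outcomes))  # 52.5
```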

A fourth estimation procedure is the use of the "Spearman-Karber estimate."
This procedure was originally devised before the introduction of stochastic
approximation techniques; its use in connection with the up-and-down method
was investigated by Tsutakawa (1967). The estimate is

\[ \hat{x}_0 = x_{\min} - \tfrac{1}{2} d + d \sum_j (1 - p_j) , \]

where x_min is the lowest written test score among all persons actually
measured with the performance measure, d is the step size, p_j is the
proportion of successes at the jth written test score level, and the sum is
over all the different written test score levels at which persons were
selected and measured for actual performance. For example, if the persons
whose performance was measured all had written test scores of 70, 80, 90, or
100, then x_min would be 70 and d would be 10. To find the p_j we would
compute the proportion of successes at each of the four levels. An equivalent
expression for this estimate, which may sometimes be more convenient, is

\[ \hat{x}_0 = x_{\max} + \tfrac{1}{2} d - d \sum_j p_j , \]

where x_max is the highest written test score among the persons measured.
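
The two forms of the Spearman-Karber estimate can be checked against each other numerically. The sketch below assumes the first form is x_min - d/2 + d Σ(1 - p_j), which is the form consistent with the Table 2 arithmetic, and the second is x_max + d/2 - d Σ p_j. The function name is hypothetical. The two forms agree whenever every level between x_min and x_max was visited, which is always the case when the score moves one step at a time.

```python
from collections import defaultdict


def spearman_karber_estimates(scores, outcomes, step):
    """Return the Spearman-Karber estimate computed both ways.

    scores, outcomes -- measured persons only
    step             -- step size d
    """
    n = defaultdict(int)       # persons measured at each level
    s = defaultdict(int)       # successes at each level
    for x, ok in zip(scores, outcomes):
        n[x] += 1
        s[x] += int(ok)
    p = {x: s[x] / n[x] for x in n}   # proportion of successes per level

    from_min = min(p) - step / 2 + step * sum(1 - pj for pj in p.values())
    from_max = max(p) + step / 2 - step * sum(p.values())
    return from_min, from_max


# Table 2 data: p = 0 at 40, 1/3 at 50, 1 at 60, 1 at 70.
scores = [70, 60, 50, 40, 50, 60, 50, 60]
outcomes = [True, True, True, False, False, True, False, True]
low_form, high_form = spearman_karber_estimates(scores, outcomes, 10)
print(round(low_form, 2), round(high_form, 2))  # 51.67 51.67
```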

Table 2 presents a set of hypothetical data illustrating the estimation
of x_0 by each of the four procedures. For this particular set of data, the
Dixon-Mood estimate and the Spearman-Karber estimate yield the same result.
However, if the ninth person took the performance measure and succeeded,
the Dixon-Mood estimate would remain unchanged, while the Spearman-Karber
estimate would decrease from 51.67 to 50. (Brownlee's estimate would decrease
from 51.43 to 50, while Wetherill's would remain unchanged at 52.5.)

A fifth estimation procedure suggested by Dixon (1965) requires the
use of tables contained in his article and is not presented here.



Table 2. Estimates of x_0 with the up-and-down method, for p = .50
(hypothetical data).

Person    Written test score    Performance
  1              70             S (success)
  2              60             S
  3              50             S
  4              40             F (failure)
  5              50             F
  6              60             S
  7              50             F
  8              60             S
  9              50             not measured

Dixon-Mood:       (1/3)(40 + 50 + 50) + (1/2)(10) = 51.67
Brownlee:         (1/7)(50 + 40 + 50 + 60 + 50 + 60 + 50) = 51.43
Wetherill:        (1/4)(40 + 60 + 50 + 60) = 52.5
Spearman-Karber:  40 - 5 + 10(1 + 2/3 + 0 + 0) = 51.67



Variance of the up-and-down estimate

Procedures have been suggested for estimating the variance of x̂_0 based
on each of the four procedures discussed in the previous section. The
technique suggested by Dixon and Mood (1948) for computing the variance of
their estimate requires some strong assumptions not likely to be satisfied in
practical applications to educational testing: the response curve is
assumed to be a normal cumulative distribution function with known standard
deviation. (Brownlee et al., 1953, pointed out that estimation of this
standard deviation from observed data would require very large samples for
reasonable precision.)

A procedure for estimating the variance of Brownlee's sample-average
estimate of x_0 was devised by Tsutakawa (1967). This procedure requires
us to identify the most frequently occurring written test score level.² We
then divide the whole sequence of observations into subsequences, ending
each subsequence as soon as this most frequent level is reached and
beginning the next subsequence with the next person. For the mth subsequence,
let t_m be the number of persons and compute the sum of their written
test scores. Let s be the number of subsequences. Then we disregard the
first subsequence and estimate the variance of x̂_0 from the remaining
subsequences, m = 2, ..., s, using the formula given by Tsutakawa (1967).

² If there is more than one most frequent level (i.e., a tie), we estimate the
variance of x̂_0 separately for each of the most frequently occurring levels,
and average these estimates (Tsutakawa, personal communication, 1975).

Wetherill and Levitt (1965) suggest a procedure for estimating the
variance of Wetherill's peaks-and-valleys estimate which may be useful if
the sample size is not too small. They suggest averaging the peaks and
valleys in pairs, letting the first estimate of x_0 be the average of the
first peak and the first valley, the second estimate be the average of
the second peak and the second valley, and so on. The sample variance of
these individual estimates of x_0, divided by the number of individual
estimates, is an estimate of the variance of x̂_0. If we let P_k and V_k
represent the kth peak and valley, and let w_k = (P_k + V_k)/2 be the kth
individual estimate, the formula for the estimated variance of x̂_0 is

\[ \widehat{\mathrm{Var}}(\hat{x}_0)
   = \frac{1}{n(n-1)} \sum_{k=1}^{n} \left( w_k - \bar{w} \right)^2 , \]

where n is the number of peak-valley pairs and \bar{w} is the mean of the
w_k.

The variance of the Spearman-Karber estimate was derived by Cornfield
and Mantel (1950, p. 208). The procedure they suggested for estimating it
can be described as follows. Let p_j represent the proportion of successes
at the jth written test score level, and let n_j represent the number of
persons observed at that level. Let d represent the step size. Then the
variance of x̂_0 is estimated by

\[ \widehat{\mathrm{Var}}(\hat{x}_0)
   = d^2 \sum_j \frac{p_j (1 - p_j)}{n_j - 1} , \]

where the sum is over all written test score levels from which persons were
actually measured for performance.

Choosing the step size

The choice of step size in the up-and-down method represents a trade-off
between speed and precision. Larger step sizes lead more quickly to the
portion of the written test score range containing x_0; smaller step sizes
permit more precise estimation of x_0. The weaker the relationship between
the written test and the performance measure, the larger the step size needed,
and the less precise will be the resulting estimate (Dixon and Mood, 1948;
Wetherill, 1963; Dixon, 1965; Davis, 1971). Brownlee et al. (1953) suggested
using large steps as long as only successes or only failures are observed,
then switching to small steps with the first change of performance. Wetherill
(1963, 1975) suggested a more general version of this method: use large steps
until some specified number of changes of performance (runs of successes or
failures) have been observed; then compute x̂_0 and begin again at this input
(written test score) level, using smaller steps to produce a more precise
estimate.

Stopping rules for the up-and-down method

The choice of a stopping rule will often be dictated by economic,
rather than statistical, considerations. The test user may have to specify
his sample size before beginning to collect performance data. However, in
many cases it may be possible to let the number of observations depend on
the data, at least within limits. Brownlee et al. (1953) recommend taking
a specified number of observations beyond the initial run of successes or
failures. Wetherill and Levitt (1965) recommend stopping after a specified
number of runs of successes or failures (i.e., a specified number of "peaks"
and "valleys"). Another possibility is to compute the estimated variance
of the estimate after each observation (or after each run of successes or
failures). When this variance becomes less than a specified size, stop
taking observations. The ideal method for choosing sample size would be
an application of decision theory, taking into account (at any stage of the
procedure) the costs of additional performance measurement and the benefits
of increased precision. However, the resulting computations might be
cumbersome.
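
The variance-based stopping rule mentioned above can be sketched as follows. The sketch assumes the Wetherill-Levitt pairing procedure as the variance estimate (sample variance of the peak-valley pair averages, divided by the number of pairs); the function names and the threshold argument are hypothetical, and any of the other variance estimates could be substituted.

```python
def pair_variance(scores, outcomes):
    """Estimated variance of the peaks-and-valleys estimate.

    Peaks and valleys are averaged in pairs (first peak with first valley,
    and so on); the sample variance of the pair averages is divided by the
    number of pairs. Returns None when fewer than two pairs are available.
    """
    # Local high points: a success that follows a failure.
    peaks = [scores[i] for i in range(1, len(scores))
             if outcomes[i] and not outcomes[i - 1]]
    # Local low points: a failure that follows a success.
    valleys = [scores[i] for i in range(1, len(scores))
               if not outcomes[i] and outcomes[i - 1]]
    pairs = [(pk + vl) / 2 for pk, vl in zip(peaks, valleys)]
    m = len(pairs)
    if m < 2:
        return None
    mean = sum(pairs) / m
    return sum((w - mean) ** 2 for w in pairs) / (m * (m - 1))


def should_stop(scores, outcomes, max_variance):
    """Stop once the estimated variance falls below the specified size."""
    v = pair_variance(scores, outcomes)
    return v is not None and v < max_variance


# Table 2 data: pair averages are 50 and 55, so the estimate is 6.25.
scores = [70, 60, 50, 40, 50, 60, 50, 60]
outcomes = [True, True, True, False, False, True, False, True]
print(pair_variance(scores, outcomes))       # 6.25
print(should_stop(scores, outcomes, 10.0))   # True
```

With only two pairs the variance estimate is itself very unstable, which is why the procedure is described as useful only when the sample is not too small.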

Variations of the up-and-down method for p ≠ .50

Since the basic up-and-down method leads to the selection of persons
with written test scores corresponding to a 50 per cent probability of
success on the performance measure, it is not well suited to estimating the
written test score corresponding to a probability of success other than .50.
However, there are a number of variations of the method which make it suitable
for this more general situation. Derman (1957) suggested a probabilistic
method that can be described as follows. If p > 1/2, move up after any
failure, but after a success, move down with probability 1/(2p) and up
with probability (2p - 1)/(2p). Thus, the higher the value of p, the less
the probability of moving down after a success. That is, the higher
probability of success we require, the more we will concentrate on persons
with high written test scores. Conversely, if p < 1/2, move down after any
success, but after a failure, move up with probability 1/(2 - 2p) and down
with probability (1 - 2p)/(2 - 2p). The estimate of x_0 for Derman's procedure
is simply the written test score that occurs most frequently (or, if there
are two or more such scores, their average).
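
Derman's rule and its estimate can be sketched briefly. The function names and the use of Python's standard `random` module as the source of randomness are assumptions made for illustration. Note that at the level where the true probability of success equals p, the chance of moving down is p times 1/(2p), or one half, which is what makes the selected scores center on that level.

```python
import random
from collections import Counter


def derman_step(success, p, rng=random):
    """Return +1 (move up one step) or -1 (move down one step)."""
    if p >= 0.5:
        if not success:
            return +1                       # always up after a failure
        # After a success: down with probability 1/(2p), otherwise up.
        return -1 if rng.random() < 1 / (2 * p) else +1
    if success:
        return -1                           # always down after a success
    # After a failure: up with probability 1/(2 - 2p), otherwise down.
    return +1 if rng.random() < 1 / (2 - 2 * p) else -1


def derman_estimate(scores):
    """Most frequent written test score (average of ties)."""
    counts = Counter(scores)
    top = max(counts.values())
    modes = [x for x, c in counts.items() if c == top]
    return sum(modes) / len(modes)


# With p = .50 the rule reduces to the basic up-and-down rule.
print(derman_step(True, 0.5), derman_step(False, 0.5))   # -1 1
print(derman_estimate([50, 60, 50, 40, 60]))             # 55.0
```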

Wetherill (1963, p. 35) suggested that Derman's probabilistic technique
would be "likely to produce some inefficiency in small samples," and discussed
some alternative variations of the up-and-down method. One variation which
he did not recommend was the obvious device of moving up more than one step
after a failure but down only one step after a success (for p > 1/2; vice
versa for p < 1/2). His objections to this method were that it would lead
to substantially biased estimates of x_0 and that the written test scores
of persons selected would not be closely grouped around the true population
value of x_0.

Wetherill (1963) did suggest two other variations of the up-and-down
method which he considered preferable to either of the two variations
described above. The first of these is as follows: After each observation
on the performance measure compare the proportion of successes at that level
(call it p_j) with p, the required probability of success. If p_j > p, move
down; if p_j < p, move up; if p_j = p, remain at that level.

Wetherill's second suggested variation is one which he calls the
"up-and-down transformed response rule" (Wetherill and Levitt, 1965;
Wetherill, 1975). This variation requires the experimenter to choose a rule
such that when the probability of success at a given level equals the desired
probability (not necessarily .50), the probability of moving up is exactly
equal to the probability of moving down. The rule is started anew after
each change of levels. For example, consider the rule: "Move up after any
failure; move down after two successes." This rule allows only three possible
sequences before changing levels. If p_j is the true probability of success
at level j, the possible sequences, with their associated probabilities and
results, are the following:

Sequence    Probability      Result
SS          p_j^2            move down
SF          p_j(1 - p_j)     move up
F           1 - p_j          move up

For this rule, the probability of moving up equals the probability of
moving down when p_j^2 = .50; that is, when p_j = .71. Therefore this rule
would be appropriate for estimating x_0 when p = .71. One obvious
limitation of this variation is that it offers the decisionmaker a limited
number of different choices of p for which the rule is reasonably simple.
However, this limitation does not seem too severe in fields such as
education, where measurement is not extremely precise. Table 3 lists
up-and-down transformed response rules corresponding to several different
values of p.
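
The success probability that a given transformed response rule targets can be found numerically: write the probability of moving down as a function of the success probability at a level, and solve for the value at which it equals one half. The sketch below does this by bisection. The function name is hypothetical, and the method assumes that the probability of moving down increases with the success probability, which holds for the rules considered here.

```python
def target_probability(prob_down, tol=1e-10):
    """Solve prob_down(p) = 0.5 for p in [0, 1] by bisection.

    prob_down -- function giving the probability that a sequence ends in
                 a move down, as an increasing function of the success
                 probability p at the current level.
    """
    lo, hi = 0.0, 1.0
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if prob_down(mid) < 0.5:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2


# "Move down after two successes": P(down) = p^2.
print(round(target_probability(lambda p: p ** 2), 2))            # 0.71
# "Move down after three successes": P(down) = p^3.
print(round(target_probability(lambda p: p ** 3), 2))            # 0.79
# "Move down after SS or SFS": P(down) = p^2 + p^2 (1 - p).
print(round(target_probability(lambda p: p * p * (2 - p)), 2))   # 0.6
```

The rules for probabilities below .50 follow by symmetry, interchanging successes with failures and "up" with "down".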

For estimating x_0 by means of the up-and-down transformed response
rule, any of the four estimation procedures discussed previously would seem
to apply, with the following revisions: Instead of counting individual
responses, count sequences of responses at the same level. For "failure",
substitute "sequence leading to a move up"; for "success", substitute
"sequence leading to a move down". For example, a "peak" in Wetherill's
peaks-and-valleys procedure would be redefined as any sequence leading to
a move down which was preceded by a sequence leading to a move up. In the
Spearman-Karber estimate, p_j would be the proportion of sequences at level j
which led to a move down, and so on.

Table 3. Up-and-down transformed response rules for estimating written
test scores corresponding to selected probabilities of success.

  p      Move up after     Move down after
 .50     F                 S
 .60     F or SFF          SS or SFS
 .71     any F             SS
 .79     any F             SSS
 .84     any F             SSSS
 .87     any F             SSSSS
 .89     any F             SSSSSS

 .40     FF or FSF         S or FSS
 .29     FF                any S
 .21     FFF               any S
 .16     FFFF              any S
 .13     FFFFF             any S
 .11     FFFFFF            any S

The "multiple-sample up-and-down method" (Hsi, 1969) is a generalized
form of the up-and-down method. The rule can be stated as follows: At each
input level, take response measures on k persons. If s or fewer succeed,
move up. If r or more succeed, move down. Otherwise remain at the same
input level. Of course, r must be greater than s. The basic up-and-down
method can be described in this form by the values k = 1, s = 0, r = 1.
When the desired success probability is .50, the three values will be
related by the expression r + s = k. For success probabilities greater than
.50, r + s > k; for success probabilities less than .50, r + s < k. The
estimate of x_0 is Brownlee's sample-average estimate.
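
The decision rule of the multiple-sample method is simple enough to state as code. This is a sketch only; the function name and the +1 / 0 / -1 return convention are assumptions for illustration.

```python
def multiple_sample_move(successes, k, s, r):
    """Decide the next move after measuring k persons at one level.

    successes -- number of the k persons who succeeded
    k, s, r   -- parameters of the rule, with 0 <= s < r <= k
    Returns +1 (move up), -1 (move down), or 0 (stay at this level).
    """
    if not (0 <= s < r <= k):
        raise ValueError("parameters must satisfy 0 <= s < r <= k")
    if not (0 <= successes <= k):
        raise ValueError("successes must be between 0 and k")
    if successes <= s:
        return +1
    if successes >= r:
        return -1
    return 0


# The basic up-and-down method: k = 1, s = 0, r = 1.
print(multiple_sample_move(0, 1, 0, 1))   # 1  (failure: move up)
print(multiple_sample_move(1, 1, 0, 1))   # -1 (success: move down)
# A rule with r + s = k, i.e. aimed at a success probability of .50.
print(multiple_sample_move(2, 4, 1, 3))   # 0  (stay)
```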

The Robbins-Monro Process

The Robbins-Monro process was devised for use with a continuous response
variable (performance measure) and a continuous input variable (written test
score). It does not require the test user to dichotomize the response
variable (the performance measure). For the continuous-response case, the test
user specifies the minimum acceptable performance in terms of an expected
score on the performance measure. Let y_0 represent this expected performance
score. The minimum passing written test score x_0 is then defined by the
expression

\[ E(Y \mid X = x_0) = y_0 , \]

where y_0 is specified by the test user and the symbol E indicates the
expected value.

Notice that it is possible to use the Robbins-Monro process with a
dichotomous response variable; in this case Y would be either 1 (for a
success) or 0 (for a failure) and y_0 would be a specified probability of
success. However, in this case one of the special advantages of the process
is lost: the dependence of the step size on the size of the difference
(Y - y_0). Empirical results with artificial data indicate that the
Robbins-Monro process works well with a dichotomous response variable only
when the desired success probability is close to .50 (Wetherill, 1963,
pp. 9-18).

The Robbins-Monro process is defined by the following rule for changing
the input:

\[ X_{n+1} = X_n - d_n (Y_n - y_0) , \]

where the d_n are a decreasing sequence of constants such that

\[ \sum_{n=1}^{\infty} d_n = \infty
   \quad \text{and} \quad
   \sum_{n=1}^{\infty} d_n^2 < \infty . \]

These decreasing step coefficients cause the values of X_n to converge to
the true value of x_0 instead of bouncing back and forth around it as in
the up-and-down method. Therefore the estimate of x_0 after n observations
is simply X_{n+1}, the written test score of the student who would be selected
next for performance measurement.

Robbins and Monro (1951) recommended choosing step coefficients
according to the sequence

\[ d_1 = C; \quad d_2 = \frac{C}{2}; \quad d_3 = \frac{C}{3}; \quad \ldots ;
   \quad d_n = \frac{C}{n} . \]

This choice of coefficients can be justified intuitively as follows: at any
stage of the process we have a prior estimate, based on all the previous
observations, which we will revise on the basis of one additional
observation. If this additional observation is the nth observation, it
contains 1/n of the information we have obtained. The rest of the information
is contained in the prior estimate. Therefore we will weight the nth
observation only 1/n as heavily as we would if it were our only piece of
information.
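
A minimal sketch of the process with step coefficients C/n is given below. It assumes the usual form of the rule, in which the next written test score is the current one minus d_n times the amount by which the observed performance exceeds y_0. The function name and the `measure` callback (a stand-in for selecting a person with the given score and measuring his performance) are hypothetical. In practice the computed score would have to be matched to an available person with the nearest written test score.

```python
def robbins_monro(x1, y0, c, n_obs, measure):
    """Run the Robbins-Monro process with step coefficients c/n.

    x1      -- written test score of the first person
    y0      -- minimum acceptable (expected) performance
    c       -- initial step coefficient C
    n_obs   -- number of performance measurements to take
    measure -- hypothetical callback: score -> observed performance
    Returns (estimate, path), where the estimate is the score of the
    person who would be selected next.
    """
    x = x1
    path = []
    for n in range(1, n_obs + 1):
        y = measure(x)
        path.append((x, y))
        x = x - (c / n) * (y - y0)    # move down if y > y0, up if y < y0
    return x, path


# Toy example with a noise-free linear response curve of slope 2, so that
# the true minimum passing score for y0 = 100 is 50.
estimate, _ = robbins_monro(70, 100, 0.5, 5, lambda x: 2 * x)
print(estimate)   # 50.0
```

In this noise-free example C = 0.5 is the inverse of the slope, and the process reaches the target in a single step; with real, noisy performance data the decreasing coefficients average out the noise over many observations.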

There remains the problem of choosing a value for C, the initial step
coefficient. The optimal choice of C depends on the slope of the response
curve; ideally, C should be the inverse of the slope at the point x_0 (Venter,
1967). However, since x_0 is unknown, this result is useful only in placing
a lower limit on C, and then only if some prior information about the response
curve is available. The results of a simulation study by Davis (1971)
suggest that it is better to have a value of C that is too large than one that
is too small. If the response curve is a normal cumulative distribution
function, a good value for C would be from two to four times its standard
deviation. If the shape of the response curve is completely unknown, we
have little guidance in choosing C.

One way to guard against the choice of too small a value for C is to
use the "delayed Robbins-Monro process" (Davis, 1971), in which the step
coefficients do not begin to decrease until there is a change of direction.
From then on, the process continues as an ordinary Robbins-Monro process.
For example, if the first three persons all have performance scores above
y_0 and the fourth scores below y_0, the step coefficients would be C, C, C,
C/2, C/3, ...
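
The sequence of step coefficients in the delayed process can be generated as follows. This is a sketch under one reading of the example above: the coefficient stays at C until the first observation that falls on the other side of y_0 from its predecessor, and decreases as C/2, C/3, ... from that observation on. The function name is hypothetical, and a performance exactly equal to y_0 is counted with the scores above y_0, which is an arbitrary choice.

```python
def delayed_coefficients(performances, y0, c):
    """Step coefficients of the delayed Robbins-Monro process.

    performances -- observed performance scores, in order
    y0           -- minimum acceptable performance
    c            -- initial step coefficient C
    Returns one coefficient per observation.
    """
    coefficients = []
    divisor = 1
    changed = False          # has a change of direction been seen yet?
    previous_above = None
    for y in performances:
        above = y >= y0
        if previous_above is not None and above != previous_above:
            changed = True
        if changed:
            divisor += 1     # coefficients now decrease: C/2, C/3, ...
        coefficients.append(c / divisor)
        previous_above = above
    return coefficients


# First three persons score above y0 = 125, the fourth scores below.
print(delayed_coefficients([130, 140, 128, 110, 120], 125, 6))
# [6.0, 6.0, 6.0, 3.0, 2.0], i.e. C, C, C, C/2, C/3
```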

Since the Robbins-Monro process is an iterative process that converges,
the test user may want to choose a stopping rule based on this convergence
property. For example, he may want to stop performance testing when a
specified number of estimates of x_0 all lie within a specified distance
of each other.
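
Such a convergence-based stopping rule might be checked as in the sketch below. It assumes that the "specified number of estimates" are the most recent consecutive ones; the function name and arguments are hypothetical.

```python
def has_converged(estimates, count, distance):
    """True when the last `count` estimates all lie within `distance`
    of each other.

    estimates -- successive estimates of the minimum passing score
    count     -- specified number of estimates to compare
    distance  -- specified maximum spread among them
    """
    if len(estimates) < count:
        return False
    recent = estimates[-count:]
    return max(recent) - min(recent) <= distance


history = [70, 60, 57.5, 56.25, 55.5]
print(has_converged(history, 3, 2.0))   # True: 57.5, 56.25, 55.5 span 2.0
print(has_converged(history, 3, 1.0))   # False
```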

Variance of the Robbins-Monro estimate 

Estimating the variance of X_n in the Robbins-Monro process is a complex
problem. There is an asymptotic result (Sacks, 1958) which states that as
the number of observations increases, the distribution of the random variable
\sqrt{n} (X_n - x_0) converges to a normal distribution with mean zero and
variance

\[ \frac{\sigma^2}{a (2\alpha - a)} , \]

where σ² is the conditional variance of Y, given X = x_0; α is the slope
of the regression of Y on X at the point X = x_0; and a is the inverse of the
first step coefficient (the expression requires 2α > a). Venter (1967) has
suggested techniques for estimating σ² and α by using a variation of the
basic Robbins-Monro process. His suggested method involves taking two
observations at each step, with input values above and below the most recent
estimate of x_0. The distance between the two input values decreases at each
step, but at a slower rate than the decrease in the step coefficients. This
process has the additional advantage of converging faster than the basic
Robbins-Monro process, but it is somewhat more complicated to administer.

Farrell (1962) devised nonparametric confidence interval procedures for
both the Robbins-Monro process and the up-and-down method. However, these
procedures are mathematically complex and (like other nonparametric confidence
interval procedures) tend to produce very wide intervals (Fabian, personal
communication, 1975).

Which method to use?

Most test users will probably find the up-and-down method (or a variation
of it) more practical than the Robbins-Monro process, for the reasons given
by Wetherill (1975):

     Two difficulties arise in attempting to apply
     the Robbins-Monro procedure to a practical
     problem. Firstly, observations must be taken
     serially and a calculation performed in between
     each one, which is not always convenient.
     Secondly, it is nearly always impracticable
     to stick to step sizes of C/n.

One limitation of the up-and-down method is its lack of flexibility
in estimating probabilities of success other than .50. The "up-and-down
transformed response" rule helps to impart some of the needed flexibility,
but the researcher still must choose from a fairly small selection of success
probabilities. However, the choice of values given in Table 3 should be
sufficient for most applications in educational and occupational testing.